Collapse source class tables into one SOURCES registry #102

Merged: jirhiker merged 1 commit into chore/config-validation-dedup from feature/source-registry on Jun 29, 2026

@jirhiker

Second of two config-system improvements. Stacked on #101 (base is chore/config-validation-dedup): review and merge that first, after which this rebases onto main cleanly.

Problem

SOURCE_DICT and the two *_SOURCE_PAIRS tables each repeated the same connector classes, so adding a source meant editing three tables in lockstep.

Change

All three derive from a single SOURCES registry:

@dataclass(frozen=True)
class SourceDef:
    key: str
    site: type
    waterlevel: type | None = None
    analyte: type | None = None

SOURCES = (
    SourceDef("bernco", BernCoSiteSource, waterlevel=BernCoWaterLevelSource),
    SourceDef("bor", BORSiteSource, analyte=BORAnalyteSource),
    ...
)
SOURCE_DICT             = {s.key: s.site ...}
WATERLEVEL_SOURCE_PAIRS = {s.key: (s.site, s.waterlevel) for s with waterlevel}
ANALYTE_SOURCE_PAIRS    = {s.key: (s.site, s.analyte)    for s with analyte}

Adding a source now takes one SourceDef entry, plus listing it under the parameters it serves in PARAMETER_SOURCE_MAP. That map stays as authored data: it encodes which analytes each agency actually reports, which can't be inferred from class wiring.
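For readers who want to run the derivation end to end, here is a minimal self-contained sketch. The four connector classes are empty stand-ins (the real ones live in the repo), only the two entries shown above are included, and the `is not None` filters are an assumption about how "for s with waterlevel" is spelled in the actual module.

```python
from __future__ import annotations

from dataclasses import dataclass


# Empty stand-ins for the real connector classes (hypothetical bodies).
class BernCoSiteSource: ...
class BernCoWaterLevelSource: ...
class BORSiteSource: ...
class BORAnalyteSource: ...


@dataclass(frozen=True)
class SourceDef:
    key: str
    site: type
    waterlevel: type | None = None
    analyte: type | None = None


# Only the two entries quoted in the description; the rest are elided there.
SOURCES = (
    SourceDef("bernco", BernCoSiteSource, waterlevel=BernCoWaterLevelSource),
    SourceDef("bor", BORSiteSource, analyte=BORAnalyteSource),
)

# Every table is a view over SOURCES, so the tables cannot drift apart.
SOURCE_DICT = {s.key: s.site for s in SOURCES}
WATERLEVEL_SOURCE_PAIRS = {
    s.key: (s.site, s.waterlevel) for s in SOURCES if s.waterlevel is not None
}
ANALYTE_SOURCE_PAIRS = {
    s.key: (s.site, s.analyte) for s in SOURCES if s.analyte is not None
}
```

Because the derived tables are plain dict comprehensions over a tuple, they iterate in the order the entries appear in `SOURCES`.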

Desync guard

tests/test_source_registry.py ties the registry to PARAMETER_SOURCE_MAP: the waterlevels agency list must equal the set of sources with a waterlevel class, and every analyte agency must have an analyte class. So a source wired in one place but not the other fails a test instead of silently dropping out of a parameter's source list.
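The two rules can be sketched as a single check. This is not the contents of tests/test_source_registry.py: the helper name `registry_desync_errors`, the shape of PARAMETER_SOURCE_MAP (parameter name mapped to a list of agency keys, with water levels under `"waterlevels"`), and the `"arsenic"` parameter are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SourceDef:
    key: str
    site: type
    waterlevel: type | None = None
    analyte: type | None = None


def registry_desync_errors(sources, parameter_source_map):
    """Return a list of mismatches between the registry and the parameter map."""
    errors = []
    with_waterlevel = {s.key for s in sources if s.waterlevel is not None}
    with_analyte = {s.key for s in sources if s.analyte is not None}

    # Rule 1: the waterlevels agency list must EQUAL the waterlevel-capable set.
    listed = set(parameter_source_map.get("waterlevels", []))
    if listed != with_waterlevel:
        errors.append(f"waterlevels mismatch: {sorted(listed ^ with_waterlevel)}")

    # Rule 2: every agency listed under an analyte must have an analyte class.
    for parameter, agencies in parameter_source_map.items():
        if parameter == "waterlevels":
            continue
        for agency in agencies:
            if agency not in with_analyte:
                errors.append(f"{parameter}: {agency} has no analyte class")
    return errors


# `object` stands in for the real connector classes.
sources = (
    SourceDef("bernco", object, waterlevel=object),
    SourceDef("bor", object, analyte=object),
)
good_map = {"waterlevels": ["bernco"], "arsenic": ["bor"]}
bad_map = {"waterlevels": ["bernco", "bor"], "arsenic": ["bernco"]}

good_errors = registry_desync_errors(sources, good_map)
bad_errors = registry_desync_errors(sources, bad_map)
```

In a pytest module the same check would reduce to asserting that the returned list is empty for the real registry and map.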

Verification

  • Iteration order of water_level_sources()/analyte_sources() is now source-key order (it was hand-curated); the full suite (306 passed) confirms nothing depended on the old order.
  • dg check defs clean.

🤖 Generated with Claude Code

SOURCE_DICT and the two *_SOURCE_PAIRS tables each repeated the same connector
classes, so adding a source meant editing three tables in lockstep (plus the
orchestration list, fixed in the prior change). They are now derived from a
single `SOURCES` registry of `SourceDef(key, site, waterlevel?, analyte?)`:

    SOURCE_DICT             = {s.key: s.site ...}
    WATERLEVEL_SOURCE_PAIRS = {s.key: (s.site, s.waterlevel) for s with waterlevel}
    ANALYTE_SOURCE_PAIRS    = {s.key: (s.site, s.analyte)    for s with analyte}

Adding a source is now one SourceDef entry (plus listing it under the
parameters it serves in PARAMETER_SOURCE_MAP, which stays as authored data —
it encodes which analytes each agency actually reports).

tests/test_source_registry.py ties the registry to PARAMETER_SOURCE_MAP: the
waterlevels agency list must equal the set of sources with a waterlevel class,
and every analyte agency must have an analyte class — so a source wired in one
place but not the other fails a test instead of silently dropping out.

Iteration order of water_level_sources()/analyte_sources() is now source-key
order (was a hand-curated order); full suite (306) confirms nothing depends on
the old order. dg check defs clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 29, 2026


Your pull request is automatically being deployed to Dagster Cloud.

Location Status Link Updated
die-orchestration View in Cloud Jun 29, 2026 at 08:22 AM (UTC)
